Quantitative association of vocal-tract and facial behavior
Abstract
This paper examines the degrees of correlation among vocal-tract and facial movement data and the speech acoustics. Multilinear techniques are applied to support the claims that facial motion during speech is largely a byproduct of producing the speech acoustics and further that the spectral envelope of the speech acoustics can be better estimated by the 3D motion of the face than by the midsagittal motion of the anterior vocal-tract (lips, tongue and jaw). Experimental data include measurements of the motion of markers placed on the face and in the vocal-tract, as well as the speech acoustics, for two subjects. The numerical results obtained show that, for both subjects, 91% of the total variance observed in the facial motion data could be determined from vocal-tract motion by means of simple linear estimators. For the inverse path, i.e. recovery of vocal-tract motion from facial motion, the results indicate that about 80% of the variance observed in the vocal-tract can be estimated from the face. Regarding the speech acoustics, it is observed that, in spite of the nonlinear relation between vocal-tract geometry and acoustics, linear estimators are sufficient to determine between 72 and 85% (depending on subject and utterance) of the variance observed in the RMS amplitude and LSP parametric representation of the spectral envelope. A dimensionality analysis is also carried out, and shows that between four and eight components are sufficient to represent the mappings examined. Finally, it is shown that even the tongue, which is an articulator not necessarily coupled with the face, can be recovered reasonably well from facial motion since it frequently displays the same kind of temporal pattern as the jaw during speech. © 1998 Elsevier Science B.V. All rights reserved.
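To make the phrase "variance determined by simple linear estimators" concrete, the following is a minimal sketch of a least-squares linear mapping from one set of measurements to another, together with the fraction of total variance it accounts for. It is an illustration under stated assumptions and not necessarily the authors' procedure: the data are synthetic stand-ins, and the names `fit_linear_estimator`, `predict`, `variance_accounted_for`, `vt` and `face`, as well as the channel counts and noise level, are hypothetical.

```python
import numpy as np


def fit_linear_estimator(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W. The last row of W is the bias."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return W


def predict(X, W):
    """Apply the fitted linear estimator (with bias) to new inputs."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    return X1 @ W


def variance_accounted_for(Y, Y_hat):
    """Fraction of the total variance in Y (summed over channels) explained by Y_hat."""
    residual = np.sum((Y - Y_hat) ** 2)
    total = np.sum((Y - Y.mean(axis=0)) ** 2)
    return 1.0 - residual / total


# Synthetic stand-in data (assumption): rows are time frames, columns are
# marker coordinates. These are NOT the measurements used in the paper.
rng = np.random.default_rng(0)
n_frames, n_vt, n_face = 500, 6, 12
vt = rng.standard_normal((n_frames, n_vt))        # hypothetical vocal-tract channels
mixing = rng.standard_normal((n_vt, n_face))      # hypothetical linear coupling
face = vt @ mixing + 0.3 * rng.standard_normal((n_frames, n_face))

W = fit_linear_estimator(vt, face)
vaf = variance_accounted_for(face, predict(vt, W))
print(f"variance accounted for: {vaf:.2f}")
```

The same functions can be called with the arguments swapped (`fit_linear_estimator(face, vt)`) to sketch the inverse path from face to vocal tract. A held-out test set would give a less optimistic estimate than the training-set figure computed here.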
Related resources
Ramsay Hunt Syndrome Associated with True Vocal Cord Palsy- A Case Report
Introduction: Varicella-zoster virus may cause an infectious disease called Ramsay Hunt syndrome. The related symptoms include facial nerve palsy (FNP), otalgia, the vesicular eruptions of the auricle and external auditory canal, less common ocular movement disorder, facial hypoesthesia, myofascial pain, vestibular symptoms, hearing loss, dysphasia, vocal cord paralysis, as well as tongue paral...
Emotion-recognition abilities and behavior problem dimensions in preschoolers: evidence for a specific role for childhood hyperactivity.
Facial emotion-recognition difficulties have been reported in school-aged children with behavior problems; little is known, however, about either this association in preschool children or with regard to vocal emotion recognition. The current study explored the association between facial and vocal emotion recognition and behavior problems in a sample of 3 to 6-year-old children. A sample of 57 c...
Effects of Voice Therapy on Vocal Tract Discomfort in Muscle Tension Dysphonia
Introduction: Patients with muscle tension dysphonia (MTD) suffer from several physical discomforts in their vocal tract. However, few studies have examined the effects of voice therapy (VT) on the vocal tract discomfort (VTD) in patients with voice disorders. Therefore, the aim of the present study was to investigate the effects of VT on the VTD in patients with MTD. Materi...
Unified physiological model of audible-visible speech production
In this paper, vocal tract and orofacial motions are measured during speech production in order to demonstrate that vocal tract motion can be used to estimate its orofacial counterpart. The inversion, i.e. vocal tract behavior estimation from orofacial motion, is also possible, but to a smaller extent. The numerical results showed that vocal tract motion accounted for 96% of the total variance ...
Is visual reference necessary? Contributions of facial versus vocal cues in 12-month-olds' social referencing behavior.
To examine the influences of facial versus vocal cues on infants' behavior in a potentially threatening situation, 12-month-olds on a visual cliff received positive facial-only, vocal-only, or both facial and vocal cues from mothers. Infants' crossing times and looks to mother were assessed. Infants crossed the cliff faster with multimodal and vocal than with facial cues, and looked more to mot...
Journal title:
- Speech Communication
Volume 26
Publication date 1998